Reducing Communication through Buffers on a SIMD Architecture
Abstract
Advances in wireless technology and the growing popularity of multimedia applications have created a need for energy-efficient, cost-effective portable supercomputers that deliver performance beyond the capabilities of current microprocessors and DSP chips. The SIMPil architecture, currently being developed at the Georgia Institute of Technology, is a promising candidate for this task. Developing applications for SIMPil requires a high-level language and an optimizing compiler for that language. Because interconnect latency is becoming a major bottleneck in computer systems, optimizations that reduce latency are increasingly important, especially for SIMPil, which is highly scalable. The compiler tracks the path of data through the network and buffers data in each processor to eliminate redundant communication. With a buffer size of 5, the compiler eliminated 96 percent of the redundant communication for the 9x9 convolution and 8x8 DCT algorithms; for 5x5 convolution, only 89 percent was eliminated. In terms of performance, a 106 percent speedup was observed for 9x9 convolution at a buffer size of 5, while 5x5 convolution and 8x8 DCT, which involve far less communication, showed only a 101 percent speedup.
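The buffering idea in the abstract can be illustrated with a small sketch. This is not the SIMPil compiler's algorithm, which works statically at compile time; it is a hypothetical runtime model in Python that assumes each processor keeps a fixed-size buffer of recently received values with least-recently-used replacement. The function name `count_transfers`, the request labels, and the replacement policy are all assumptions made for illustration.

```python
from collections import OrderedDict


def count_transfers(requests, buffer_size):
    """Count network transfers for one processor, without and with a buffer.

    requests    -- sequence of hashable labels, one per remote value needed
                   (hypothetical: e.g. a neighbor pixel identifier)
    buffer_size -- number of received values the processor can keep
                   (assumed LRU replacement; the paper does not state a policy)

    Returns (transfers_without_buffer, transfers_with_buffer).
    """
    buffer = OrderedDict()  # keys are labels of values currently buffered
    transfers = 0
    for label in requests:
        if label in buffer:
            # Value is already local: the communication is redundant and skipped.
            buffer.move_to_end(label)
            continue
        transfers += 1  # value must be fetched over the network
        if buffer_size > 0:
            buffer[label] = True
            if len(buffer) > buffer_size:
                buffer.popitem(last=False)  # evict the least recently used value
    return len(requests), transfers


if __name__ == "__main__":
    reqs = list("abcabda")  # hypothetical access pattern with repeated values
    print(count_transfers(reqs, 0))
    print(count_transfers(reqs, 3))
```

In this toy access pattern, three of the seven requests repeat a value that is still buffered when the buffer holds three entries, so only four transfers remain. A larger stencil such as a 9x9 convolution reuses neighbor data far more often, which is consistent with the abstract's observation that it benefits most from buffering.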
Similar resources
Logic Simulation Using an Asynchronous Parallel Discrete-Event Simulation Model on a SIMD Machine
The Chandy-Misra-Bryant (CMB) model has been applied to logic simulation of synchronous sequential circuits using a massively parallel SIMD computer, a CM-2 Connection Machine. Several methods of reducing message traffic in a logic simulation have been adapted to the SIMD architecture of the CM-2, with the result that each method of reducing message traffic actually decreases the speed of the simul...
RC-SIMD: Reconfigurable communication SIMD architecture for image processing applications
During the last two decades, Single Instruction Multiple Data (SIMD) processors have become important architectures in embedded systems for image processing applications. The main reasons are their area and energy efficiency. Often the processing elements (PEs) of an SIMD processor are only locally connected. This may result in a communication bottleneck (only access to direct neighbors). One w...
Modified SIMD architecture suitable for single-chip implementation
We describe a modified SIMD architecture suitable for single-chip integration of a large number of processing elements, such as 1,000 or more. Important differences from traditional SIMD designs are: a) The size of the memory per processing element is kept small. b) The processors are organized into groups, each with a small buffer memory. Reduction operation over the groups is done in hardwar...
A Sliding Memory Plane Array Processor
This paper describes a new mesh-connected SIMD architecture, called a Sliding Memory Plane (SliM) Array Processor. On SliM, the inter-processing element (inter-PE) communication, using the sliding memory plane, and the data input/output (I/O), using two I/O planes, can occur without interrupting the PEs, which greatly diminishes the communication and I/O overhead. SliM is unique in its ability...
Fractal Image Compression - the Encoding Phase
In this work we introduce and analyze algorithms for fractal image compression on massively parallel SIMD arrays. The different algorithms discussed differ significantly in terms of their communication and computation structure. Therefore the most suited algorithm for a given architecture may be selected according to our investigations. Experimental results compare the performance of the algorithm...